Faculty  Working  Paper  91-0153 


330 
B385 

1991s 153 


COPY 


Quantile  Smoothing  Splines 


The  library  of  me 
MOV  ,  6  W1 

University  o\  m*te 

ot  urbana-Chanf.*^ 


Roger  Koenker 

Department  of  Economics 
University  of  Illinois 


Pin  Ng 

Department  of  Economics 
University  of  Houston 


Bureau  of  Economic  and  Business  Research 

College  of  Commerce  and  Business  Administration 

University  of  Illinois  at  Urbana-Champaign 


BEBR 


FACULTY  WORKING  PAPER  NO.  91-0153 

College  of  Commerce  and  Business  Administration 

University  of  Illinois  at  Urbana-Champaign 

July  1991 


Quantile  Smoothing  Splines 


Roger  Koenker 

Department  of  Economics 

University  of  Illinois  at  Urbana-Champaign 

Pin  Ng 

Department  of  Economics 

University  of  Houston 


Digitized  by  the  Internet  Archive 

in  2011  with  funding  from 

University  of  Illinois  Urbana-Champaign 


http://www.archive.org/details/quantilesmoothin91153koen 


Quantile  Smoothing  Splines1 
Roger  Koenker  a  and  Pin  Ng  b 

a  Department  of  Economics,  University  of  Illinois,  Champaign,  Illinois  61820,  U.S.A. 

b  Department  of  Economics,  University  of  Houston,  Houston,  TX  77204-5882,  U.S.A. 


Abstract 

Quantile  smoothing  splines  defined  as  solutions  to 


min  XPaOv  -g(Xi))  +  XJq  \g  "{x)\dx 


geC 

with  pa(w)  =  (a  -I{u  <  0))w  are  proposed  as  a  simple,  nonparametric  approach  to  estimat- 
ing conditional  quantile  functions.  We  show  that  solutions,  g,  are  parabolic  splines,  i.e. 
piecewise  quadratics,  on  the  mesh  {x\,  ...,  xn),  and  may  be  computed  by  standard  /i-type 
linear  programming  techniques.  At  X  =  0,  g  interpolates  the  a^  quantiles  at  the  distinct 
design  points,  and  for  X  sufficiently  large  g  is  the  linear  regression  quantile  fit  to  the  obser- 
vations. The  entire  path  of  solutions,  in  the  penalty  parameter  X,  may  be  efficiently  com- 
puted by  linear  parametric  programming  methods. 


1.      INTRODUCTION 

In  practice  nonparametric  regression  is  virtually  always  the  estimation  of  flexible 
models  for  conditional  mean  functions.  Some  recent  theory  has  advanced  robustified 
methods  of  estimating  alternative  measures  of  conditional  central  tendency,  e.g.,  Cox 
(1983),  Utreras  (1981),  and  Hardle  (1990).  But  aspects  other  than  the  central  tendency  of 
conditional  distributions  are  frequently  of  substantial  interest.  What,  for  example,  is  the 
seasonal  pattern  of  extreme  temperatures,  water  levels,  or  pollution  readings?  Are  estimated 
trends  in  mean  incomes  or  SAT  performance  consistent  with  trends  in  extreme  quantiles? 
Efron  (1991)  has  persuasively  argued  that  regression  percentiles  have  a  critical  role  to  play 
in  regression  diagnostics  for  generalized  linear  models. 


The  authors  wish  to  thank  Cornelius  Gutenbrunner,  Steve  Portnoy  and  Mike  Osborne  for  helpful 
discussions,  and  Esther  Portnoy  for  pointing  out  the  important  Schuette  reference.  Research  support  from  NSF 
SES  89-22472  is  also  gratefully  acknowledged. 


Several  authors  have  proposed  methods  for  nonparametric  estimation  of  such  condi- 
tional quantile  functions.  Troung  (1989)  following  the  nearest  neighbor  approach  of  Stone 
(1977),  Samanta  (1989)  and  Antoch  and  Janssen  (1989)  using  kernel  methods  and  White 
(1990)  using  neural  networks  have  all  suggested  methods  for  nonparametric  estimation  of 
conditional  quantile  functions.  Extending  the  parametric  methods  for  estimating  linear 
models  for  conditional  quantiles  of  Koenker  and  Bassett  (1978),  Hendricks  and  Koenker 
(1990)  discuss  sieve-type  regression  spline  models  and  apply  them  to  electricity  demand 
data.  Cox  (1988)  and  Jones  (1988),  reviving  an  idea  of  Bloomfield  and  Steiger  (1983),  have 
suggested  estimating  quantile  smoothing  splines  which  minimize 

£pa(y;  -£(*/))  +  X\(g  "(x)?dx  (1.1) 

where  pa(")  =  (ot-/(w  <  0))u  is  the  Czech  function  of  Koenker  and  Bassett  (1978).  Here 
the  parameter  as  [0,  1]  controls  the  quantile  of  interest,  while  X  e  R+  controls  the  smooth- 
ness of  the  resulting  estimate,  thus  generalizing  the  extensive  literature  on  1 2  smoothing 
splines  pioneered  by  Wahba  (1990).  This  is  an  intriguing  idea,  and  has  also  been  mentioned, 
for  example,  in  Eubank  (1988)  and  Utreras  (1981)  in  the  median  Pi/2(w)  =  I  w  I  case.  How- 
ever, the  resulting  quadratic  program  poses  some  serious  computational  obstacles.  Obvi- 
ously the  computational  virtues  of  the  piecewise  linear  form  of  the  first  term  of  the  objective 
function  are  sacrificed  by  the  quadratic  form  of  the  smoothness  penalty. 

One  is  thus  naturally  led  to  ask:  "why  not  replace  (g  "(x))2  in  the  penalty  by 
\g  "(x)  I  ?"  The  median  special  case  of  this  problem  has  been  studied  in  a  remarkable  paper 
by  Schuette  (1978)  in  the  actuarial  literature.  We  will  show,  expanding  on  Schuette's 
discrete  version  of  the  problem  using  finite  differences,  that  minimizing 

R^[g]  =  ^a(yi-g{xd)^X\l\g'\x)\dx  (1.2) 

1=1  u 

over  g  e  Cl[0,  1]  retains  the  linear  programming  form  of  the  parametric  version  of  the 
quantile  regression  problem  and  yields  solutions  which  are  parabolic,  i.e.,  quadratic,  splines. 
These  quantile  smoothing  splines  have  several  attractive  features.  Unlike  the  familiar  l^- 
smoothing  splines,  they  are  qualitatively  robust  to  gross  errors  in  the  observations  {>>;}. 
Like  the  /2-case,  as  X  — »  0  the  estimate  becomes  rougher:  for  A.  sufficiently  large  the  esti- 
mate coincides  with  the  bivariate  linear  regression  quantile  estimate,  while  for  X  in  a  neigh- 
borhood of  zero  the  estimate  interpolates  the  01th  sample  quantiles  at  each  distinct  design 
point.  As  X  increases  the  number  of  interpolated  observations  decreases  monotonically, 
indeed  an  important  computational  aspect  of  the  quantile  smoothing  problem  is  that  all  of 
the  distinct  solutions  corresponding  to  distinct  X  may  be  found  by  performing  a  sequence  of 
simplex  pivots  following  methods  of  parametric  linear  programming,  in  very  much  the  same 
manner  that  all  of  the  distinct  regression  quantile  solutions  may  be  found  by  parametric 
variation  of  a,  see  Koenker  and  D'Orey  (1987). 


2.      MEDIAN  SMOOTHLNG  SPLINES 

Given  observations  {(y,,  x/)  :  i  =  1,  ...,  n  }  with  0  <  X\  <  ...  <xn  <  1  consider  the  prob- 
lem of  minimizing 


R\[g]  =  j:\yi-g(xi)\+xl1\g~(x)\dx 


(2.1) 


overg  e  Cl[0,  1]  ,  the  space  of  continuous  functions  on  [0,  1]  with  continuous  first  deriva- 
tive. 

Definition.  A  function  g  :  [0,  1]  — » R  is  a  parabolic  spline  with  mesh 
0=^0  <*i  <  •••  <xn  <Xn+i  =  1  if  g  €  C2[0,  1]  and  g(x)  is  piecewise  quadratic  in  the 
intervals  [jc,-,  *,+i),  /  =  0,  ...,  n,  that  is  ,  g  has  the  form 

g  (x)  =  a,(jt  -  Xi)2  +  P/(x  -  x,0  +  7/     JC;  ^jc  <  xi+1       i  =  0 n  (2.2) 

Theorem.  There  exists  a  parabolic  spline  £  which  solves  (2.1). 

Proof.  Suppose  g  solves  (2.1).  We  will  show  that  there  exists  a  parabolic  spline  g  such  that 
R[g]=R[g]'  Suppose,  provisionally,  that  sgn(g"(x))  is  constant  on  the  intervals 
[jc/,  Xj+i),  i  =  0,  ...,  n,  so  we  may  write 


j'\g'\x)\dx  =  j:\gxxi+l)-g\xi)\ 

0  /=0 


(2.3) 


Note  that  we  may  set  a,  =  xh\g  '(x«+i)  -  g  '(*i)V(*£+i  ~^/)  for  ^  =  0,  ...,  «  and  determine  the 
remaining  coefficients  of  g  from  the  conditions 

g(Xi+)=yi=g(*i+)  i-h  »,  n 

and 

g'0c,+)  =  P/=g'(xJ+)        z  =  l,  ...,  n 

The  parabolic  spline  g  thus  constructed  is  in  Cl[0,  1],  since  g  was,  and  clearly  achieves  the 
same  value  of  R.  Finally,  note  that  if  g  "  changes  sign  on  an  interval  between  knots,  say 
[JFjtX;+i),  the  same  construction  and  the  fact  that  \\f  I  >  if  implies  that  the  resulting  g 
satisfies  R  [g]  <  R [g]  which  contradicts  our  hypothesis  that  g  " could  change  sign.    □ 

Having  established  the  form  of  the  solution  to  (2.1),  it  is  straightforward  to  develop  an 
algorithm  to  compute  g.  Using  (2.2)  the  penalty  becomes 

\l\g'Xx)\dx=2±hi\ai\. 

where  h\  =a:,+1  -xt,  i  =  1,  ...,  n-\.  Note  that  like  other  smoothing  spline  problems  g  must 
be  linear  in  the  exterior  intervals  [0,  X\)  and  (r„,  1],  since  otherwise,  the  penalty  could  be 
reduced  without  affecting  the  "fidelity"  term.  From  the  continuity  conditions 

a.ih}  +  pihi+yi=yi+l 

2a,A  +  P,  =  P,+i  *  =  1,  ...."-2 

eliminating  the  P/'s  we  have 

Y1+2-Y1+1 

a(/2/  +  a/+1/2,+1  = 


hi+\ 


Yi+i  ~li 


hi 


i  =  1,  ...,  n-2. 


Including  the  exterior  segments  [0,X\)  and  [xn,\]  we  have  3(n+\)  free  parameters  and  In 
linear  continuity  constraints  plus  the  2  conditions  that  Oq  =  an  =  0,  which  leaves  us  with 
n+\  free  parameters.  Thus,  given  the  Y,-'s  we  have  one  "free"  a,-,  say,  cci,  and  we  may 
rewrite  (2.1)  as 


min  Y.\y  j  —  A-/JC  ,*0I 

e  e  R"+1 

wherey'as(y'.0'll_i),  9'=(ai,  yly ...,  yn),  X,=I(i<n)  +  Xf(i  >  n), 


(2.4) 


X  = 


0 


0 


(2/i-l)x(/i+l) 


,-1 


A=HD~lB,    H  =  diag(h), 


D  = 


1     0     •     •    •      •        0 

Ai  A2    0  0 

0    h2  h3  0 


0 


0  /!n_2   /in_i 


(n-l)x(n-l) 


and 


B  = 


10  0  0  0    0 

0  h\l  -(Kil+hix)         h2l  0    0 

-(hi1  +h~3l )  /zj1 


0 
0 


hi1 


0    0 


0 


-1 


-1 


0   h-l2    -(hnll+hnU)         h~U 


(n-l)x(/i+l) 


Clearly,  (2.4)  is  a  conventional  /  j  -regression  problem  and  can  be  solved  with  conventional 
software;  an  'S'  function  (Becker  and  Chambers  and  Wilks  (1988))  to  accomplish  this 
appears  in  the  Appendix. 

Once  we  notice  that  the  median  smoothing  spline  may  be  expressed  as  the  solution  to  a 
particular  l±  -regression  problem,  a  number  of  other  important  features  of  the  solution  are 
immediately  apparent.  We  have  an  /j  regression  with  (n+l)  parameters  and  (In  -  1)  obser- 
vations; solutions  must  have  n+l  residuals  which  are  zero  (by  complementary  slackness) 
and  in  our  case^  these  zero  residuals  correspond  to  either  (i)  exact  interpolation  of  an  obser- 
vation, so  v,-  =  Yi  or  (ii)  linearity  of  g  in  a  particular  subinterval  of  the  design  mesh,  so  a,-  =  0 
for  some  i.  Obviously,  the  parameter  X  controls  the  relative  "advantage"  of  these  two  alter- 
natives. When  X  is  sufficiently  large  all  the  a,-  will  be  zero  and  the  solution  will  correspond 
to  the  bivariate  linear  /pfiL    When  X  is  sufficiendy  small  all  n  observations  will  be 


interpolated,  and  all  but  one  of  the  a,-'s  can  be  non-zero. 

As  in  any  smoothing  problem,  choice  of  "bandwidth"  —  here  represented  by  the  param- 
eter X  —  is  critical.  For  median  smoothing  splines  and  quantile  smoothing  splines  in  gen- 
eral, this  problem  is  ameliorated  by  the  fact  that  the  whole  family  of  solutions  to  (2.1)  for 
X  e  [0,  «»)  may  be  easily  found  by  parametric  linear  programming.  Suppose  at  some  X  -  Xq 
we  consider  the  solution  g  satisfying  y,-  =g(x,-)  =?/,  i  =  1,  ...,  n.  As  X  increases  from  0  , 
this  solution  remains  optimal  as  long  as  the  subgradient  condition 

-Xj  <  (  ZjgnQi  -  XiX$)XiXiX~hl  )j  <  Xj  ^  ^ 

holds.  Here,  following  Bassett  and  Koenker  (1978),  we  use  h  to  index n+\  element  subsets 
of  K  =  { 1,  2,  ...,  In-  1 },  h  =K-h,  and  Xh  denotes  the  submatrix  of  X  with  row  indices  in 
h.  Noting  that  X;  =  1  for  ieh\,  and  for  ieh2,  it  follows  that  sg/i  (y/-k/Jt/9)  =  sgn (-/4;.9) 
since  X>0,  equation  (2.5)  may  be  rewritten  as 

Xj  <  rj  +  SjX  <Xj   j  =  1,  ...,  n+l 

with  n  and  Sj  the  yth  elements  of  the  vectors: 

r=  ^sgniyi-x^XiX'h 

S=  Zsgn(yi-XiQ)XiX~hl 
ieh2 

where  we  have  decomposed  h  =h\Kj  /i2  with  ^  c{l,  ...,  n)  and  h2  c  {n+l,  ...,  2/i— 1 }, 
so  the  next  X  is 

X\  =min{  min{max{X<),  (l-r;)/5;-,  -(l+r;)/5;} },  min{max{^,  -rj/(l+Sj),  r;/(l-5;)}}}. 
;'e/i,  jeh2 

Continuing  this  iteration  until  Xm+\  -Xm  yields  all  distinct  solutions  to  the  problem  (2.1) 
with  the  solution  for  X  >  Xm  corresponding  to  the  linear  / 1  fit  to  the  observations.  An  impor- 
tant implication  of  this  fact  is  that  we  may  initially  solve  the  simpler  linear  l\  problem 
corresponding  to  X  = «»  and  gradually  relax  the  roughness  penalty  with  a  sequence  of  sim- 
plex pivots,  thus  avoiding  a  direct  solution  of  the  potentially  rather  large  problem  (2.4). 

Each  "click"  to  a  new  solution  involves  a  single  simplex  pivot  of  an  extremely  sparse 
constraint  matrix,  and  hence  solving  (2.1)  for  a  broad  range  of  X  appears  quite  feasible.  One 
interesting  aspect  of  the  way  that  solutions  y(X)  depend  upon  the  penalty  parameter  X  con- 
cerns the  number  of  interpolated  points.  In  the  l2  smoothing  spline  literature  much  has  been 
made  of  the  "effective  dimensionality"  or  "degrees  of  freedom"  of  the  estimated  curves 
corresponding  to  various  X.  Such  measures  of  dimensionality  are  usually  based  on  the  trace 
of  various  quasi -projection  matrices  in  the  least-squares  theory.  See  Buja,  Hastie  and 
Tibshirani  (1989)  for  an  extensive  discussion.  For  the  median  smoothing  spline  the  connec- 
tion is  more  direct  in  the  sense  that  there  is  an  explicit  trade-off  between  the  number  of 
interpolated  points  and  the  number  of  linear  segments.  Since  "reasonable"  smoothing  sug- 
gests that  the  number  of  interpolated  points  is  small  relative  to  n,  it  is  probably  sensible  to 
start  the  parametric  programming  at  the  linear  /  \  -solution  rather  than  at  X  =  0. 

If  the  design  is  in  "general  position"  so  no  two  observations  share  the  same  design 
point,  there  must  be  at  least  2  and  at  most  n  interpolated  V/'s.  Call  this  number  p\.  Clearly, 
Px  is  a  plausible  measure  of  the  effective  dimension  of  the  fitted  model  with  penalty 


parameter  X,  and  n  -p\  +  1,  which  corresponds  to  the  number  of  linear  segments  in  the 
fitted  function,  is  a  plausible  measure  of  the  degrees  of  freedom  of  the  fit  Such  decomposi- 
tions might  be  used  in  conjunction  with  the  function  R  [|]  itself  to  implement  data-driven 
bandwidth  choice,  for  example,  along  the  lines  of  Akaike  (1974)  or  Schwarz  (1978).i 


3.      QUANTILES  AND  OTHER  EXTENSIONS 

Smoothing  splines  for  other  quantiles  may  be  estimated  by  replacing  the  I  *  I 's  in  the 
fidelity  term  with  the  Czech  function,  as  in  (1.2).  It  seems  natural  to  maintain  the  sym- 
metric form  of  the  penalty  term.  For  a  fixed  quantile  the  entire  path  of  solutions  in  the 
penalty  parameter  X  can,  again,  be  found  by  methods  of  parametric  linear  programming. 
Likewise  for  X  fixed  the  entire  set  of  distinct  solutions  for  a  e  [0,  1]  may  be  found  by  simi- 
lar methods. 

A  simple  approach  to  inference  using  these  methods  may  be  developed  along  the  lines 
suggested  in  Hendricks  and  Koenker  (1990)  for  regression  splines.  For  a  given  choice  of 
quantile,  a,  and  penalty,  X,  we  may  easily  compute  the  estimated  quantile  functions  for 
a±h,  X,  providing  a  tolerance  band  for  ga  ^.  A  Siddiqui  (1960)  estimate  may  then  be  com- 
puted at  each  design  point  for  the  sparsity  function  and  used  to  construct  an  estimate  of  the 
covariance  matrix  of  0„. 

There  are  a  number  of  intriguing  extensions  incorporating  further  constraints.  Mono- 
tonicity  and  convexity  of  the  fitted  function  g  may  be  readily  imposed  by  imposing  further 
inequality  constraints  on  the  parameters  of  the  problem,  using  the  variation  reducing  proper- 
ties of  splines  developed  in  Schumaker  (1981).  See  Ramsey  (1988)  for  a  discussion  of  some 
applications  of  these  ideas  to  the  h  smoothing  spline  case.  Note  however  that  while  adding 
such  inequality  constraints  to  the  1 2  problem  results  in  a  significant  increase  in  complexity 
adding  linear  inequality  constraints  to  the  quantile  smoothing  spline  problem  does  not  alter 
the  fundamental  nature  of  the  optimization  problem  to  be  solved. 

Finally,  there  is  the  extension  of  these  methods  to  multivariate  settings  where  the  addi- 
tive spline  models  of  Buja,  Hastie  and  Tibshirani  (1989)  and  others  naturally  suggest  them- 
selves. Clearly  the  nonlinear  character  of  the  present  smoothers  vitiate  the  attractive  itera- 
tive "backfitting"  algorithms  available  in  the  /2-case.  But  feasible  estimators  may  still  be 
possible  using  a  limited  number  of  simplex  pivots  from  an  initial  linear  (in  covariates)  quan- 
tile function  estimate. 

Consideration  of  the  asymptotic  performance  of  quantile  smoothing  splines  raises 
some  challenging  problems  which  we  hope  to  address  in  future  work.  Unfortunately,  it 
does  not  appear  easy  to  adapt  the  asymptotic  theory  of  /2-smoothing  splines  to  the  present 
case.  However,  the  asymptotic  linearity  results  of  Cox  (1983)  for  M-type  smoothing 
splines,  if  extendable  to  the  penalty,  may  suggest  a  way  forward.  The  recent  work  of  [ 
White(1990)  on  neural  network  models  for  conditional  quantile  functions  appears  to  offer  a 
relatively  straightforward  approach  to  consistency  for  quantile  smoothing  splines. 


4.      SOME  PICTURES  IN  LIEU  OF  A  CONCLUSION 

In  Figure  4.1  we  illustrate  several  estimates  corresponding  to  various  X's  of  the  median 
smoothing  spline  for  Example  1  of  Schuette  (1978)  which  is  a  problem  in  actuarial  gradua- 
tion, i.e.  smoothing  of  life  tables.  These  figures  match  closely,  but  not  exactly,  Schuette 's 
finite  difference  calculations  reported  in  his  Table  2.  The  interpolating  nature  of  the  solu- 
tions is  immediately  apparent  in  the  Figure. 


Figure  4.1:  Median  Smoothing  Splines  for  Schuette's  Example  1 


In  Figure  4.2  we  illustrate  several  estimates  of  the  median  smoothing  spline  for  the 
"motorcycle  data"  of  Schmidt,  Mattem  and  Schuler.  See  Hardle  (1990)  and  Silverman 
(1985)  for  discussions  of  li  smoothing  spline  analyses  of  this  data  set.  There  is  clearly 
some  increase  in  dispersion  in  y  for  x  e  [30,  40]  and  one  noticeable  difference  between 
these  median  estimates  and  the  mean  estimates  is  the  reduced  sensitivity  of  the  former  to  the 
outiying  points  below  the  curve.  Another  aspect  of  this  data  set  which  perhaps  deserves 
some  comment  is  the  fact  that  there  are  multiple  observations  at  some  design  points,  which 
we  have  assumed  away  in  the  treatment  of  Section  2.  This  is  rectified  in  the  S  functions 
given  in  the  Appendix  in  the  following  simple  fashion.  Knots  are  placed  at  each  distinct  x 
observation,  and  we  parameterize  the  spline  by  the  values  it  takes  at  those  knots,  and  the 
value  of  second  derivative  in  the  interval  between  the  first  two  knots.  Obviously,  the  iden- 
tity matrix  in  X  must  be  replaced  by  a  block  diagonal  matrix  with  columns  of  l's  in  the 
diagonal  blocks.  It  is  this  structure  which  implies  as  we  noted  in  the  introduction  that  the 
X  =  0  interpolate  solution  passes  through  the  sample  quantiles  of  the  distinct  design  points. 
Of  course,  if  the  x\  are  all  distinct  then  we  have  a  fully  interpolative  solution.  An  alternative 
to  this  approach  which  has  been  extensively  explored  in  the  spline  literature  is  to  introduce 
multiple  knots  at  the  same  design  point  which  then  has  the  effect  of  relaxing  the  continuity 
requirement  on  the  function  and  its  derivative.   We  note  in  conclusion  that  the  number  of 


interpolated  points,  p\y  for  the  3  illustrated  curves  is  41,  26,  23  for  X  =  .5,  3,  10,  respec- 
tively. Results  of  Portnoy  (1984)  and  Welsh  (1989)  suggest  thatp2//i  must  tend  to  zero  for 
consistency,  and  this  suggests  since  n  =  133  here  that  even  the  X  =  10  estimate  may  be 
undersmoothed.  Clearly,  a  salient  advantage  of  this  approach  over  the  more  straightforward 
fixed  knot  regression  spline  approach  is  that  the  data  is  allowed  to  choose  regions  in  which 
the  shape  of  the  function  changes  rapidly. 
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Figure  4.2:  Median  Smoothing  Splines  for  Motorcycle  Data 


Finally,  in  Figure  4.3  we  illustrate  quantile  smoothing  splines  for  three  distinct  quan- 
tiles:  a  =  .1,  .5,  .9.  The  smoothing  parameter  X  is  chosen  to  be  10  for  all  three  estimates, 
although  one  might  still  argue  that  the  resulting  curves  are  undersmoothed.  Clearly,  these 
estimates  reflect  substantial  heterogeneity  in  the  conditional  distribution  of  y  over  the 
observed  range  of  x,  thus  vindicating  the  methods  to  some  extent.  Clearly,  it  is  exactly  this 
sort  of  nonhomogeneity  that  we  would  hope  to  identify  with  quantile  methods.  The  compu- 
tations for  Figure  4.3  were  done  using  a  modified  version  of  the  regression  quantile  routine 
described  in  Koenker  and  d'Orey(1987)  which  implements  the  parametric  programming 
aspect  of  the  bandwidth  selection.  We  hope  to  report  further  on  this  algorithm  at  a  later 
date. 

One  feature  of  Figure  4.3  which  is  quite  striking  is  the  "near-piecewise  linearity"  of  the 
estimated  curves.  The  L\  roughness  penalty  is  perhaps  too  complaisant  about  the  sharp 
elbows  of  g,  allowing  g  "  to  be  large  as  long  as  this  doesn't  occur  over  long  intervals.  One 
possible  response  to  this  phenomenon  is  to  restrict  the  number  of  knots  to  a  more  regular 
mesh.  Another  possible  response  is  to  replace  the  L\  penalty  with  an  L„  penalty  which 
effectively  controls  the  supremum  of  g  "( • ).  This  problem  too  falls  within  the  linear  pro- 
gramming framework  and  we  are  currently  exploring  algorithms  for  its  solution. 
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Figure  4.3:  Quantile  Smoothing  Splines  for  Motorcycle  Data 
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5.      APPENDIX 


"llssh"<- 

function(x,   y,   w,    lambda) 

{ 

^Compute  11  smoothing  spline  with  penalty  parameter  lambda 
jjSolution  is  a  quadratic  spline: 

#  g(x)  =  a_i  *  (x  -  x_i)A2  +  b_i  *  (x  -  x_i)  +  c_i 
I Inputs: 

#  x — explanatory  variable 

#  y — explained  variable 

#  w — weights 

#  lambda — penalty  parameter 
^Outputs : 

#  xun — ordered  vector  of  x  with  no  duplicate  values 

#  g — smoothed  value  at  each  design  point  with 

#  g[l]  -  a_l,  g[i+l]  =  c_i 
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ny  <-  len(y) 
ox  <-  order (x) 

xun  <-  unique(x[ox]) 
h  <-  diff (xun) 
nh  <-  len(h) 
D  <-  diag(h,  nrow  -  nh) 

D[row(D)  —  col(D)  +  1]  <-  h[l:(nh  -  1)] 
D[l,  1]  <-  1 

B  <-  diag(l/h,  nrow  -  nh) 

B[row(B)  —  col(B)  +  1]  <-  -  (l/h[l: (nh  -  1) ]  +  l/h[2:(nh)]) 
B[row(B)  —  col(B)  +  2]  <-  l/h[2:(nh  -  1)] 
B  <-  cbind(c(0,  l/h[l],  rep(0,  nh  -  2)),  B) 
B[l,   ]  <-  0 
B  <-  cbind(0,  B) 
B[l,  1]  <-  1 

A  <-  diag(h)  %*%  solve(D)  %*%  B 
X  <-  matrix(0,  ny,  nh  +  1) 

X[cbind(l:ny,  category(x[ox]) ) ]  <-  1    #fidelity  part  of  design  matrix 
X  <-  rbind(cbind(0,  diag(w[ox])  %*%  X),  lambda  *  A)  #the  whole^  matrix 
return(x  -  xun,  g  =  llfit(X,  c(w[ox]  *  y[ox],  rep(0,  nh) ) ,  int  =  F) $ 
coef) 


"qspline"<- 
function(x,  g,  z) 

{ 

#compute  quadratic  spline  at  points  z  given  knots  at  x 

# Inputs: 

#  x — unique  vector  of  ordered  explanatory  variable  returned  from 

#  llssh(x,y,w, lambda) 

#  g — smoothed  value  at  each  unique  design  point,  component  returned 

#  from  llssh(x,y,w, lambda) 

#  z — points  where  smoothed  values  are  desired 
#Output: 

#  return  function  values  at  points  z 
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BIG  <-  le+301 

n  <-  len(x) 

x  <-  sort(x) 

h  <-  c(0,  diff(x)) 

a  <-  rep(0,  n  +  1) 

a[2]  <-  g[l] 

g[l]  <-  g[2] 

b  <-  rep(  -  a[2]  *  h[2]  +  (g[3]  -  g[2])/h[2],  n  +  1) 

for(i  in  3: (n  +  1) )  ( 

b[i]  <-  (2  *  (g[i]  -  g[i  -  l]))/h[i  -  1]  -  b[i  -  l] 
) 

a[3:n]  <-  diff(b[3:(n  +  l)])/(2  *  h[3:n]J 
k  <-  cut(z,  c(x,  BIG))  +  1      #obtain  bin  numbers  for  z 
k[z  <=  x[l]]  <-  1 
x  <-  c(x[l],  x) 
dz  <-  z  -  x[k] 
return(g[k]  +  b[k]  *  dz  +  a[k]  *  dzA2) 
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